Appl. No. 09/720,086; 102(e): July 23, 2001

Dkt. No. 0609.4560002/JAG/KRM/DJN; Group Art Unit: 1642
Inventors: Li et al.; Tel: 202/371-2600

Title: De Novo DNA Cytosine Methyltransferase Genes, Polypeptides
and Uses Thereof



Mouse Dnmt3a DNA sequence 



1 

I 


P A A TTfWYV 

G AA 1 1 GGGGG 


A T/^/^ TAP AAA 

GIGGI GGGGG 


A AAA AAAA A A /TAPrPPrTr A PKPrrr AP A 

GLLGGGUGAt GGGGGGGGGG AGAGGGGAGA 


01 


r t r k r^Ar k r*TA a a 
GCCGCCTGAA 


pppp appppt 
GGGGAGGGG 1 


AAAAATAAAA TTTTPPP APP PPTTPAPATP 

GAGGGIGGAG 1 1 1 IGGGAGG GGI IGAGAIG 


101 


A /V* A TO T A TV* 

AGGGTCTA1G 


TTT A AATATT 
1 1 1 AAG ILII 


APATATTAAT T AAA AAA AAA AAAAAAATTA 

AGG 1 0 1 1 GG 1 1 AGAAAG AGG AGGGGAA 1 1 G 


151 


ATTATATA A A 

CTTCTCTGAA 


AAAATAAA A A 

GCCC1CGCAG 


AAAA A A AAAA AAATAA A A AA AAA A AAA TAA 

CCCCACAGCG CCCTCGCAGG GGGAGGGIGG 


201 


AAAAT A ATAA 

CGCCTACTGC 


AA AAA A A TAA 

CGAGGAAIGG 


AA TAA A A AAA APAPPPAAAA APPAPAAPAT 

GGIGGAGGGG GGGGGGGGAG AGGAGGAGGI 


251 CCTCTCTGGA GCGGGAGGAT GATCGAAAGG AAGGAGAGGA ACAGGAGGAG

301 AACCGTGGCA AGGAAGAGCG CCAGGAGCCCAGC^CACGGXCCGGAAGGI

351 GGGGAGGCCT GGCCGGAAGC GCAAGCACCC ACCGGTGGAA AGCAGTGACA

401 CCCCCAAGGA CCCAGCAGTG ACCACCAAGT CTCAGCCCAT GGCCCAGGAC

451 TCTGGCCCCT CAGATCTGCT ACCCAATGGA GACTTGGAGA AGCGGAGTGA

501 ACCCCAACCT GAGGAGGGGA GCCCAGCTGC AGGGCAGAAG GGTGGGGCCC

551 CAGCTGAAGG AGAGGGAACT GAGACCCCAC CAGAAGCCTC CAGAGCTGTG

601 GAGAATGGCT GCTGTGTGAC CAAGGAAGGC CGTGGAGCCT CTGCAGGAGA

651 GGGCAAAGAA CAGAAGCAGA CCAACATCGA ATCCATGAAA ATGGAGGGCT

701 CCCGGGGCCG ACTGCGAGGT GGCTTGGGCT GGGAGTCCAG CCTCCGTCAG

751 CGACCCATGC CAAGACTCAC CTTCCAGGCA GGGGACCCCT ACTACATCAG


80 1 


AAA A AAA AAA 

CAAACGGAAA 


AAAA A TA AAT 

CGGGAlGAGl 


APPTPPPAPP TTPPAAAAPP PAPPPTPAPA 

GGG 1 GGGAGG 1 1 GGAAAAGG GAGGG 1 GAGA 

• 


obi 


AAA A A AAA A A 

AGAAAGGGAA 


AAT A ATTAAA 

00 1 AA 1 1 GGA 


PTA ATP A ATP PTPTPPAAPA PAAPPAPPPP 

GIAAIGAAIG GIGIGGAAGA GAAGGAGGGG 


901 


TATAA AAAAT 

IGIGGAGAGI 


ATAAA A A AAT 

G 1 GAGAAGG 1 


nrnf^r 1 Ap'pw apppptpptp ptptppappa 
GGAGGAGGGG AGGGGIGGIG GIGIGGAGGA 


951 GCCCACGGAC CCTGCTTCTC CGACTGTGGC CACCACCCCT GAGCCAGTAG

1001 GAGGGGATGC TGGGGACAAG AATGCTACCA AAGCAGGCGA CGATGAGCCT

1051 GAGTATGAGG ATGGCCGGGG CTTTGGCATT GGAGAGCTGG TGTGGGGGAA

1101 ACTTCGGGGC TTCTCCTGGT GGCCAGGCCG AATTGTGTCT TGGTGGATGA



FIG.1A-1
1151 CAGGCCGGAG CCGAGCAGCT GAAGGCACTC GCTGGGTCAT GTGGTTCGGA

1201 GATGGCAAGT TCTCAGTGGT GTGTGTGGAG AAGCTCATGC CGCTGAGCTC

1251 CTTCTGCAGT GCATTCCACC AGGCCACCTA CAACAAGCAG CCCATGTACC

1301 GCAAAGCCAT CTACGAAGTC CTCCAGGTGG CCAGCAGCCG TGCCGGGAAG

1351 CTGTTTCCAG CTTGCCATGA CAGTGATGAA AGTGACAGTG GCAAGGCTGT

1401 GGAAGTGCAG AACAAGCAGA TGATTGAATG GGCCCTCGGT GGCTTCCAGC

1451 CCTCGGGTCC TAAGGGCCTG GAGCCACCAG AAGAAGAGAA GAATCCTTAC

1501 AAGGAAGTTT ACACCGACAT GTGGGTGGAG CCTGAAGCAG CTGCTTACGC

1551 CCCACCCCCA CCAGCCAAGA AACCCAGAAA GAGCACAACA GAGAAACCTA

1601 AGGTCAAGGA GATCATTGAT GAGCGCACAA GGGAGCGGCT GGTGTATGAG

1651 GTGCGCCAGA AGTGCAGAAA CATCGAGGAC ATTTGTATCT CATGTGGGAG

1701 CCTCAATGTC ACCCTGGAGC ACCCACTCTT CATTGGAGGC ATGTGCCAGA

1751 ACTGTAAGAA CTGCTTCTTG GAGTGTGCTT ACCAGTATGA CGACGATGGG

1801 TACCAGTCCT ATTGCACCAT CTGCTGTGGG GGGCGTGAAG TGCTCATGTG

1851 TGGGAACAAC AACTGCTGCA GGTGCTTTTG TGTCGAGTGT GTGGATCTCT

1901 TGGTGGGGCC AGGAGCTGCT CAGGCAGCCA TTAAGGAAGA CCCCTGGAAC

1951 TGCTACATGT GCGGGCATAA GGGCACCTAT GGGCTGCTGC GAAGACGGGA

2001 AGACTGGCCT TCTCGACTCC AGATGTTCTT TGCCAATAAC CATGACCAGG

2051 AATTTGACCC CCCAAAGGTT TACCCACCTG TGCCAGCTGA GAAGAGGAAG

2101 CCCATCCGCG TGCTGTCTCT CTTTGATGGG ATTGCTACAG GGCTCCTGGT

2151 GCTGAAGGAC CTGGGCATCC AAGTGGACCG CTACATTGCC TCCGAGGTGT

2201 GTGAGGACTC CATCACGGTG GGCATGGTGC GGCACCAGGG AAAGATCATG

2251 TACGTCGGGG ACGTCCGCAG CGTCACACAG AAGCATATCC AGGAGTGGGG

2301 CCCATTCGAC CTGGTGATTG GAGGCAGTCC CTGCAATGAC CTCTCCATTG



FIG.1A-2 
2351 TCAACCCTGC CCGCAAGGGA CTTTATGAGG GTACTGGCCG CCTCTTCTTT

2401 GAGTTCTACC GCCTCCTGCA TGATGCGCGG CCCAAGGAGG GAGATGATCG

2451 CCCCTTCTTC TGGCTCTTTG AGAATGTGGT GGCCATGGGC GTTAGTGACA

2501 AGAGGGACAT CTCGCGATTT CTTGAGTCTA ACCCCGTGAT GATTGACGCC

2551 AAAGAAGTGT CTGCTGCACA CAGGGCCCGT TACTTCTGGG GTAACCTTCC

2601 TGGCATGAAC AGGCCTTTGG CATCCACTGT GAATGATAAG CTGGAGCTGC

2651 AAGAGTGTCT GGAGCACGGC AGAATAGCCA AGTTCAGCAA AGTGAGGACC

2701 ATTACCACCA GGTCAAACTC TATAAAGCAG GGCAAAGACC AGCATTTCCC

2751 CGTCTTCATG AACGAGAAGG AGGACATCCT GTGGTGCACT GAAATGGAAA

2801 GGGTGTTTGG CTTCCCCGTC CACTACACAG ACGTCTCCAA CATGAGCCGC

2851 TTGGCGAGGC AGAGACTGCT GGGCCGATCG TGGAGCGTGC CGGTCATCCG

2901 CCACCTCTTC GCTCCGCTGA AGGAATATTT TGCTTGTGTG TAAGGGACAT

2951 GGGGGCAAAC TGAAGTAGTG ATGATAAAAA AGTTAAACAA ACAAACAAAC

3001 AAAAAACAAA ACAAAACAAT AAAACACCAA GAACGAGAGG ACGGAGAAAA

3051 GTTCAGCACC CAGAAGAGAA AAAGGAATTT AAAGCAAACC ACAGAGGAGG

3101 AAAACGCCGG AGGGCTTGGC CTTGCAAAAG GGTTGGACAT CATCTCCTGA

3151 GTTTTCAATG TTAACCTTCA GTCCTATCTA AAAAGCAAAA TAGGCCCCTC

3201 CCCTTCTTCC CCTCCGGTCC TAGGAGGCGA ACTTTTTGTT TTCTACTCTT

3251 TTTCAGAGGG GTTTTCTGTT TGTTTGGGTT TTTGTTTCTT GCTGTGACTG

3301 AAACAAGAGA GTTATTGCAG CAAAATCAGT AACAACAAAA AGTAGAAATG

3351 CCTTGGAGAG GAAAGGGAGA GAGGGAAAAT TCTATAAAAA CTTAAAATAT

3401 TGGTTTTTTT TTTTTTTCCT TTTCTATATA TCTCTTTGGT TGTCTCTAGC

3451 CTGATCAGAT AGGAGCACAA ACAGGAAGAG AATAGAGACC CTCGGAGGCA

3501 GAGTCTCCTC TCCCACCCCC CGAGCAGTCT CAACAGCACC ATTCCTGGTC



FIG.1A-3 
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3551 ATGCAAAACA GAACCCAACT AGCAGCAGGG CGCTGAGAGA ACACCACACC 

3601 AGACACTTTC TACAGTATTT CAGGTGCCTA CCACACAGGA AACCTTGAAG 

3651 AAAACCAGTT TCTAGAAGCC GCTGTTACCT CTTGTTTACA GTTTATATAT

3701 ATATGATAGA TATGAGATAT ATATATATAA AAGGTACTGT TAACTACTGT 

3751 ACATCCCGAC TTCATAATGG TGCTTTCAAA ACAGCGAGAT GAGCAAAGAC 

3801 ATCAGCTTCC GCCTGGCCCT CTGTGCAAAG GGTTTCAGCC CAGGATGGGG 

3851 AGAGGGGAGC AGCTGGAGGG GGTTTTAACA AACTGAAGGA TGACCCATAT 

3901 CACCCCCCAC CCCTGCCCCA TGCCTAGCTT CACCTGCCAA AAAGGGGCTC 

3951 AGCTGAGGTG GTCGGACCCT GGGGAAGCTG AGTGTGGAAT TTATCCAGAC 

4001 TCGCGTGCAA TAACCTTAGA ATATGAATCT AAAATGACTG CCTCAGAAAA 

4051 ATGGCTTGAG AAAACATTGT CCCTGATTTT GAATTCGTCA GCCACGTTGA 

4101 AGGCCCCTTG TGGGATCAGA AATATTCCAG AGTGAGGGAA AGTGACCCGC 

4151 CATTAACCCC NCCTGGAGCA AATAAAAAAA CATACAAAAT GT 



FIG.1A-4 
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Mouse Dnmt3b1 DNA Sequence 



1 


PAATTPPPPP 
bAA I Ibbbbb 


bbbbbbbbl t 


aapppppppa 

AAbbbbbbbA 


APTAAAPPTA 
Mb 1 nMMbb 1 M 


PPPPAPPPAT 
bbbbAbbbA 1 


0 1 


bbbbbbbbbA 


pattppppaa 

bAI 1 bbbb AM 


cccr apaptp 

bbbbAbAb 1 b 


rcrrrccccc 

bbbbbbbbbb 


PPPPPPPAPP. 
bbbbuubnbb 


1U1 


Abbbbbbbbb 


bbA 1 bbbbbb 


rrrrrm ap 

bbbbbbb 1 At 


APPPAPPPTP 
AbbbAbbb 1 b 


APPAPAPPPP 
AbbAbAbbbb 


151 


ppptp apppt 
bbblbAbbbl 


tp tpppapap 
1 b 1 bbb AbAb 


pttppa a app 
bl IbbAAAbb 


TPAPPTATAT 
1 bAbb 1 A 1 A 1 


APPTTTPPAP 
Abb 1 1 1 bbAb 


zUl 


apppppp atp 
AbbbbbbA 1 b 


1 bbbb l-bbbb 


PATPPATAPT 

bA 1 bbA 1 Ab 1 


PPPTTPPPAP 
bbb 1 1 bbbAb 


PA AATPPAPP 
bAAA 1 bbAbb 


z51 


PPPTTPTTTP 

bbb II L 1 II L 


APP A A APA AT 

AbbAAAbAA 1 


P A APPPAPAP 

bAAbbbAbAb 


APPAPAPATP 

AbbAbAbA 1 b 


TP A A TP A APA 
1 bAA 1 b AAbA 


301 


AbAbbblbbb 


Apn/^r 1 /^ T A TP 

Abbbbb 1 A 1 b 


APP AP TPP AT 

AbbAb IbbAI 


TATPPTTA AT 

lAlbbl lAAi 


PPPA APTTPA 

bbbAAb 1 1 bA 


351 


GTGACCAG IC 


PTPAPAPAPP 

b 1 bAbAbAbb 


A APP A TPP TP 

AAbbAlbblb 


bblbAbbbbb 


APTPTTPP AP 

Ab Ibl IbbAb 


4U1 


PPA ATPTPPA 

bbAAIblbtA 


PAPAPPPAPT 

bAbAbbbAb 1 


ptpp ap app a 
b 1 bb AbAbbA 


P AP A PP AP AH 

bAb Abb AbAb 


PPPPPAPPTP 
bbbbt Abb 1 b 


451 


AAbt 1 bbbbb 


ptptpt a apa 
blblbl AAbA 


ppp apptptp 
bbbAbb 1 b 1 b 


P APPPT TP TP 

bAbbbl lb lb 


A ATT APAPPP 

AAI lAbAbbb 


501 


app apatpap 

AGGACATbAb 


APP A/* 1 A TPP A 

AbbAbAlbbA 


p ap ap ap A TP 
bAbAbAbAlb 


A TP A APT APA 
AlbAAb 1 AbA 


TP A TPPP A AT 

1 b A 1 bbb AA 1 


551 


ppptptpat a 
bbblblbAIA 


TTPTA ATPPP 
1 1 b 1 AA 1 bbb 


a a apptpapp 
AAAbb 1 bAbb 


PPTPAPAPPA 

bb lb AbAbbA 


APPAPAPPAP 

AbbAbAbbAb 


501 


pappppptpt 
bAbbbbb 1 b 1 


P A A Af^fYY^r 1 

bAAAbbbbbb 


nnrrr a at 
b 1 b 1 bbb AAb 


PPP APA T APP 
bbbAbA 1 Abb 


AATPPPAPPT 
AA 1 bbbAbb 1 


b5l 


ppappttppa 
bbAbbl IbbA 


p appp a a ap a 
bAbbbAAAbA 


bbb 1 bbbbbA 


PAATPAPPPP 
bAA 1 bAbbbb 


APPTPPPPAP 
Abb 1 bbbbAb 


/Ul 


bb bbbbb Abb 


atptppappa 

A 1 0 1 bbAbb A 


PTAPPPTPTP 
b 1 Abbb 1 b 1 b 


PAPTTTPPPP 
bAb 1 1 1 bbbb 


PTAPPAPPTP 
b 1 AbbAbb ( b 


/ J I 


tpppapappt 

1 bbb Mb Abb 1 


ppappatppt 

bbAbbM 1 bb \ 


PTTPAPPAAP 
b 1 1 bMbbMMb 


PAPPPPATPP 

bAbbbbM 1 UU 


TPATPPPPTP 


801 CCAGCGTCGA CTTCATGGAA GAAGTGACAC CTAAGAGCGT CAGTACCCCA

851 TCAGTTGACT TGAGCCAGGA TGGAGATCAG GAGGGTATGG ATACCACACA

901 GGTGGATGCA GAGAGCAGAG ATGGAGACAG CACAGAGTAT CAGGATGATA

951 AAGAGTTTGG AATAGGTGAC CTCGTGTGGG GAAAGATCAA GGGCTTCTCC

1001 TGGTGGCCTG CCATGGTGGT GTCCTGGAAA GCCACCTCCA AGCGACAGGC



FIG.1B-1 
1051 CATGCCCGGA ATGCGCTGGG TACAGTGGTT TGGTGATGGC AAGTTTTCTG

1101 AGATCTCTGC TGACAAACTG GTGGCTCTGG GGCTGTTCAG CCAGCACTTT

1151 AATCTGGCTA CCTTCAATAA GCTGGTTTCT TATAGGAAGG CCATGTACCA

1201 CACTCTGGAG AAAGCCAGGG TTCGAGCTGG CAAGACCTTC TCCAGCAGTC

1251 CTGGAGAGTC ACTGGAGGAC CAGCTGAAGC CCATGCTGGA GTGGGCCCAC

1301 GGTGGCTTCA AGCCTACTGG GATCGAGGGC CTCAAACCCA ACAAGAAGCA

1351 ACCAGTGGTT AATAAGTCGA AGGTGCGTCG TTCAGACAGT AGGAACTTAG

1401 AACCCAGGAG ACGCGAGAAC AAAAGTCGAA GACGCACAAC CAATGACTCT

1451 GCTGCTTCTG AGTCCCCCCC ACCCAAGCGC CTCAAGACAA ATAGCTATGG

1501 CGGGAAGGAC CGAGGGGAGG ATGAGGAGAG CCGAGAACGG ATGGCTTCTG

1551 AAGTCACCAA CAACAAGGGC AATCTGGAAG ACCGCTGTTT GTCCTGTGGA

1601 AAGAAGAACC CTGTGTCCTT CCACCCCCTC TTTGAGGGTG GGCTCTGTCA

1651 GAGTTGCCGG GATCGCTTCC TAGAGCTCTT CTACATGTAT GATGAGGACG

1701 GCTATCAGTC CTACTGCACC GTGTGCTGTG AGGGCCGTGA ACTGCTGCTG

1751 TGCAGTAACA CAAGCTGCTG CAGATGCTTC TGTGTGGAGT GTCTGGAGGT

1801 GCTGGTGGGC GCAGGCACAG CTGAGGATGC CAAGCTGCAG GAACCCTGGA

1851 GCTGCTATAT GTGCCTCCCT CAGCGCTGCC ATGGGGTCCT CCGACGCAGG

1901 AAAGATTGGA ACATGCGCCT GCAAGACTTC TTCACTACTG ATCCTGACCT

1951 GGAAGAATTT GAGCCACCCA AGTTGTACCC AGCAATTCCT GCAGCCAAAA

2001 GGAGGCCCAT TAGAGTCCTG TCTCTGTTTG ATGGAATTGC AACGGGGTAC

2051 TTGGTGCTCA AGGAGTTGGG TATTAAAGTG GAAAAGTACA TTGCCTCCGA

2101 AGTCTGTGCA GAGTCCATCG CTGTGGGAAC TGTTAAGCAT GAAGGCCAGA

2151 TCAAATATGT CAATGACGTC CGGAAAATCA CCAAGAAAAA TATTGAAGAG

2201 TGGGGCCCGT TCGACTTGGT GATTGGTGGA AGCCCATGCA ATGATCTCTC



FIG.1B-2 
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2251 TAACGTCAAT CCTGCCCGCA AAGGTTTATA TGAGGGCACA GGAAGGCTCT 
2301 TCTTCGAGTT TTACCACTTG CTGAATTATA CCCGCCCCAA GGAGGGCGAC 
2351 AACCGTCCAT TCTTCTGGAT GTTCGAGAAT GTTGTGGCCA TGAAAGTGAA 
2401 TGACAAGAAA GACATCTCAA GATTCCTGGC ATGTAACCCA GTGATGATCG 
2451 ATGCCATCAA GGTGTCTGCT GCTCACAGGG CCCGGTACTT CTGGGGTAAC 
2501 CTACCCGGAA TGAACAGGCC CGTGATGGCT TCAAAGAATG ATAAGCTCGA 
2551 GCTGCAGGAC TGCCTGGAGT TCAGTAGGAC AGCAAAGTTA AAGAAAGTGC 
2601 AGACAATAAC CACCAAGTCG AACTCCATCA GACAGGGCAA AAACCAGCTT
2651 TTCCCTGTAG TCATGAATGG CAAGGACGAC GTTTTGTGGT GCACTGAGCT 
2701 CGAAAGGATC TTCGGCTTCC CTGCTCACTA CACGGACGTG TCCAACATGG 
2751 GCCGCGGCGC CCGTCAGAAG CTGCTGGGCA GGTCCTGGAG TGTACCGGTC 
2801 ATCAGACACC TGTTTGCCCC CTTGAAGGAC TACTTTGCCT GTGAATAGTT 
2851 CTACCCAGGA CTGGGGAGCT CTCGGTCAGA GCCAGTGCCC AGAGTCACCC 
2901 CTCCCTGAAG GCACCTCACC TGTCCCGTTT TTAGCTCACC TGTGTGGGGC 
2951 CTCACATCAC TGTACCTCAG CTTTCTCCTG CTCAGTGGGA GCAGAGCCTC 
3001 CTGGCCCTTG CAGGGGAGCC CCGGTGCTCC CTCCGTGTGC ACAGCTCAGA 
3051 CCTGGCTGCT TAGAGTAGCC CGGCATGGTG CTCATGTTCT CTTACCCTGA 
3101 AACTTTAAAA CTTGAAGTAG GTAGTAAGAT GGCTTTCTTT TACCCTCCTG 
3151 AGTTTATCAC TCAGAAGTGA TGGCTAAGAT ACCAAAAAAA CAAACAAAAA 
3201 CAGAAACAAA AAACAAAAAA AAACCTCAAC AGCTCTCTTA GTACTCAGGT 
3251 TCATGCTGCA AAATCACTTG AGATTTTGTT TTTAAGTAAC CCGTGCTCCA 
3301 CATTTGCTGG AGGATGCTAT TGTGAATGTG GGCTCAGATG AGCAAGGTCA 
3351 AGGGGCCAAA AAAAATTCCC CCTCTCCCCC CAGGAGTATT TGAAGATGAT 
3401 GTTTATGGTT TAAGTCTTCC TGGCACCTTC CCCTTGCTTT GGTACAAGGG



FIG.1B-3
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3451 CTGAAGTCCT GTTGGTCTTG TAGCATTTCC CAGGATGATG ATGTCAGCAG 

3501 GGATGACATC ACCACCTTTA GGGCTTTTCC CTGGCAGGGG CCCATGTGGC 

3551 TAGTCCTCAC GAAGACTGGA GTAGAATGTT TGGAGCTCAG GAAGGGTGGG 

3601 TGGAGTGGCC CTCTTCCAGG TGTGAGGGAT ATCAAGGAGG AAGCTTAGGG 

3651 AAATCCATTC CCCACTCCCT CTTGCCAAAT GAGGGGCCCA GTCCCCAACA 

3701 GCTCAGGTCC CCAGAACCCC CTAGTTCCTC ATGAGAAGCT AGGACCAGAA 

3751 GCACATCGTT CCCCTTATCT GAGCAGTGTT TGGGGAACTA CAGTGAAAAC

3801 CTTCTGGAGA TGTTAAAAGC TTTTTACCCC ACGATAGATT GTGTTTTTAA 

3851 GGGGTGCTTT TTTTAGGGGC ATCACTGGAG ATAAGAAAGC TGCATTTCAG 

3901 AAATGCCATC GTAATGGTTT TTAAACACCT TTTACCTAAT TACAGGTGCT 

3951 ATTTTATAGA AGCAGACAAC ACTTCTTTTT ATGACTCTCA GACTTCTATT 

4001 TTCATGTTAC CATTTTTTTT GTAACTCGCA AGGTGTGGGC TTTTGTAACT 

4051 TCACAGGTGT GGGGAGAGAC TGCCTTGTTT CAACAGTTTG TCTCCACTGG 

4101 TTTCTAATTT TTAGGTGCAA AGATGACAGA TGCCCAGAGT TTACCTTTCT 

4151 GGTTGATTAA AGTTGTATTT CTCTAAAAAA AAAAAAAAAA AAAAA 



FIG.1B-4
Human DNMT3A DNA Sequence



4 
1 






GCCGCGG 


o a oo Kf^/ s r i rT t 

CACCAGGGCG 


ooo nrrrr^r* 


28 


CCGGCCCGAC 


ooo a ^1*0^^^ 


AT A OOO TOO A 

AlAuGGIGGA 


OOO A TOO A AP 


LLlrtLAtLlrA 


78 


CAGGCTGACA 


o k ooo aooot 

GAGGCACCGT 


to aoo ao aoo 
TCACCAGAGG 


OOTOA AO AOO 

GUOAACAU, 


O^O ATOTATr* 


128 


TTTAAGTTTT 


k A/*Tr*T/V*00 

AACTCTCGCC 


TOO A A AO AOO 

TCCAAAGACC 


AOO AT A AT TO 

ACGATAA11C 


OTTOOOOA A A 
1 1 1 UA/t AAA 


178 


A A A A AAA a A a 

GCCCAGCAGC 


a a a a a » a a a a 

CCCCCAGCCC 


/*»AAA A i/N/Nrk/N 

CGCGCAGCCC 


O AOOOTOOOT 

CAGCCTGCCT 


CCCGGCGCCC 


f\ rt 

228 


AGATGCCCGC 


CATGCGCTCC 


AGCGGCCCCG 


OOO AO* OO AO 

GGGACACCAG 


O AOOTOTOOT 

CAGCTCTGCT 


278 


A AuA A A A /'WS/'N/'N 

GCGGAGCGGG 


* AA A aa A AAA 

AGGAGGACCG 


1 A A /\/\ ft /W\ 1 

AAAGGACGGA 


A A A A AAA ft A A 

GAGGAGCAGG 


aoo kt^f^w^r*/* 

AGGAGCCGCG 


328 


T 1 a^%^\ A A *A^\ A aA- 

TGGCAAGGAG 


A A A AA AA A A /*\ 

GAGCGCCAAG 


A A AAA A 

AGCCCAGCAC 


A A AAA A > AAA 

CACGGCACGG 


A A AAXAAAA A 

AAGGTGGGGC 


378 


GGCCTGGGAG 


GAAGCGCAAG 


CACCCCCCGG 


T A A AAA A AAA 

TGGAAAGCGG 


X A ft A ft A A A A A 

TGACACGCCA 


428 


■ . A A « AAATA 

AAGGACCCTG 


AAA T/"\ A TAT/N 

CGGTGATCTC 


CAAGTCCCCA 


TCCATGGCCC 


AOO AOTOAOO 

AGGACTCAGG 


478 


CGCCTCAGAG 


AT A X X A AAA A 

CTATTACCCA 


ft TAAAA 1 ATT 

ATGGGGACTT 


A A A A A A A A A A 

GGAGAAGCGG 


■ATA IAAAAA 

AGTGAGCCCC 


528 


AGCCAGAGGA 


A A A A A AAA AT 

GGGGAGCCCT 


GCTGGGGGGC 


ft A A A /VV/"*/\/\/% 

AGAAGGGCGG 


AAA AAA ft/t/N A 

GGCCCCAGCA 


578 


^\ A AAA A St A A.A 

GAGGGAGAGG 


A T A A A A A Trt A 

GTGCAGCTGA 


A A AA/\T/\/*/\T 

GACCCTGCCT 


A A ft A/NAT/* A A 

GAAGCCTCAA 


GAGCAGTGGA 


628 


AAA TAAAYAA 

AAATGGCTGC 


TA A A A A A A A A 

TGCACCCCCA 


ft A A A AAA AAA 

AGGAGGGCCG 


A A A 1AAAAAT 

AGGAGCCCCT 


OO AO A a/^AAA 

GCAGAAGCGG 


678 


A A AAA A A A A A 

GCAAAGAACA 


A A A A A A A A A A 

GAAGGAGACC 


i iAi T/V* ft ft T 

AACATCGAAT 


A A a f A A ft A ft T 

CCATGAAAAT 


GGAGGGCTCC 


728 


CGGGGCCGGC 


T A /V^A A P too 

1GCGGGGTGG 


OTTOOOOTOO 

CTTGGGCTGG 


OAOTOOAOOO 

GAGTCCAGCC 


TOOOTOAOOO 


778 


GCCCATGCCG 


AGGCTCACCT 


too kf^r^fv^f^/^ 

TCCAGGCGGG 


OO AOOOOT AO 

GGACtCL 1 AC 


T AO A TO AOO A 

1 ALA 1 LAGCA 


ooo 

828 


AGCGCAAGCG 


a /v* ao too 

GGACGAG1GG 


ATAAAAAAAT 


OO AAA AOOO A 

GbAAAAGOGA 


OOOTOaOA Af* 

bbu 1 bAbAAb 


878 AAAGCCAAGG TCAGTGCAGG AATGAATGCT GTGGAAGAAA ACCAGGGGCC

928 CGGGGAGTCT CAGAAGGTGG AGGAGGCCAG CCCTCCTGCT GTGCAGCAGC

978 CCACTGACCC CGCATCCCCC ACTGTGGCTA CCACGCCTGA GCCCGTGGGG

1028 TCCGATGCTG GGGACAAGAA TGCCACCAAA GCAGGCGATG ACGAGCCAGA



FIG.1C-1 
1078 GTACGAGGAC GGCCGGGGCT TTGGCATTGG GGAGCTGGTG TGGGGGAAAC

1128 TGCGGGGCTT CTCCTGGTGG CCAGGCCGCA TTGTGTCTTG GTGGATGACG

1178 GGCCGGAGCC GAGCAGCTGA AGGCACCCGC TGGGTCATGT GGTTCGGAGA

1228 CGGCAAATTC TCAGTGGTGT GTGTTGAGAA GCTGATGCCG CTGAGCTCGT

1278 TTTGCAGTGC GTTCCACCAG GCCACGTACA ACAAGCAGCC CATGTACCGC

1328 AAAGCCATCT ACGAGGTCCT GCAGGTGGCC AGCAGCCGCG CGGGGAAGCT

1378 GTTCCCGGTG TGCCACGACA GCGATGAGAG TGACACTGCC AAGGCCGTCG

1428 AGGTGCAGAA CAAGCCCATG ATTGAATGGG CCCTGGGGGG CTTCCAGCCT

1478 TCTGGCCCTA AGGGCCTGGA GCCACCAGAA GAAGAGAAGA ATCCCTACAA

1528 AGAAGTGTAC ACGGACATGT GGGTGGAACC TGAGGCAGCT GCCTACGCAC

1578 CACCTCCACC AGCCAAAAAG CCCCGGAAGA GCACAGCGGA GAAGCCCAAG

1628 GTCAAGGAGA TTATTGATGA GCGCACAAGA GAGCGGCTGG TGTACGAGGT

1678 GCGGCAGAAG TGCCGGAACA TTGAGGACAT CTGCATCTCC TGTGGGAGCC

1728 TCAATGTTAC CCTGGAACAC CCCCTCTTCG TTGGAGGAAT GTGCCAAAAC

1778 TGCAAGAACT GCTTTCTGGA GTGTGCGTAC CAGTACGACG ACGACGGCTA

1828 CCAGTCCTAC TGCACCATCT GCTGTGGGGG CCGTGAGGTG CTCATGTGCG

1878 GAAACAACAA CTGCTGCAGG TGCTTTTGCG TGGAGTGTGT GGACCTCTTG

1928 GTGGGGCCGG GGGCTGCCCA GGCAGCCATT AAGGAAGACC CCTGGAACTG

1978 CTACATGTGC GGGCACAAGG GTACCTACGG GCTGCTGCGG CGGCGAGAGG

2028 ACTGGCCCTC CCGGCTCCAG ATGTTCTTCG CTAATAACCA CGACCAGGAA

2078 TTTGACCCTC CAAAGGTTTA CCCACCTGTC CCAGCTGAGA AGAGGAAGCC

2128 CATCCGGGTG CTGTCTCTCT TTGATGGAAT CGCTACAGGG CTCCTGGTGC

2178 TGAAGGACTT GGGCATTCAG GTGGACCGCT ACATTGCCTC GGAGGTGTGT



FIG.1C-2 
2228 GAGGACTCCA TCACGGTGGG CATGGTGCGG CACCAGGGGA AGATCATGTA

2278 CGTCGGGGAC GTCCGCAGCG TCACACAGAA GCATATCCAG GAGTGGGGCC

2328 CATTCGATCT GGTGATTGGG GGCAGTCCCT GCAATGACCT CTCCATCGTC

2378 AACCCTGCTC GCAAGGGCCT CTACGAGGGC ACTGGCCGGC TCTTCTTTGA

2428 GTTCTACCGC CTCCTGCATG ATGCGCGGCC CAAGGAGGGA GATGATCGCC

2478 CCTTCTTCTG GCTCTTTGAG AATGTGGTGG CCATGGGCGT TAGTGACAAG

2528 AGGGACATCT CGCGATTTCT CGAGTCCAAC CCTGTGATGA TTGATGCCAA

2578 AGAAGTGTCA GCTGCACACA GGGCCCGCTA CTTCTGGGGT AACCTTCCCG

2628 GTATGAACAG GCCGTTGGCA TCCACTGTGA ATGATAAGCT GGAGCTGCAG

2678 GAGTGTCTGG AGCATGGCAG GATAGCCAAG TTCAGCAAAG TGAGGACCAT

2728 TACTACGAGG TCAAACTCCA TAAAGCAGGG CAAAGACCAG CATTTTCCTG

2778 TCTTCATGAA TGAGAAAGAG GACATCTTAT GGTGCACTGA AATGGAAAGG

2828 GTATTTGGTT TCCCAGTCCA CTATACTGAC GTCTCCAACA TGAGCCGCTT

2878 GGGGAGGCAG AGACTGCTGG GCCGGTCATG GAGCGTGCCA GTCATCCGCC

2928 ACCTCTTCGC TCCGCTGAAG GAGTATTTTG CGTGTGTGTA AGGGACATGG

2978 GGGCAAACTG AGGTAGCGAC ACAAAGTTAA ACAAACAAAC AAAAAACACA

3028 AAACATAATA AAACACCAAG AACATGAGGA TGGAGAGAAG TATCAGCACC

3078 CAGAAGAGAA AAAGGAATTT AAAACAAAAA CCACAGAGGC GGAAATACCG

3128 GAGGGCTTTG CCTTGCGAAA AGGGTTGGAC ATCATCTCCT GATTTTTCAA

3178 TGTTATTCTT CAGTCCTATT TAAAAACAAA ACCAAGCTCC CTTCCCTTCC

3228 TCCCCCTTCC CTTTTTTTTC GGTCAGACCT TTTATTTTCT ACTCTTTTCA

3278 GAGGGGTTTT CTGTTTGTTT GGGTTTTGTT TCTTGCTGTG ACTGAAACAA

3328 GAAGGTTATT GCAGCAAAAA TCAGTAACAA AAAATAGTAA CAATACCTTG

3378 CAGAGGAAAG GTGGGAGGAG AGGAAAAAAG GGAAATTTTT AAAGAAATCT



FIG.1C-3 
3428 ATATATTGGG TTGTTTTTTT TTTTGTTTTT TGTTTTTTTT TTTTGGGTTT

3478 TTTTTTTTTA CTATATATCT TTTTTTTGTT GTCTCTAGCC TGATCAGATA

3528 GGAGCACAAG CAGGGGACGG AAAGAGAGAG ACACTCAGGC GGCAGCATTC

3578 CCTCCCAGCC ACTGAGCTGT CGTGCCAGCA CCATTCCTGG TCACGCAAAA


3628 




TTAGCAGCAG GGAGAGGAGA 


ACACCACACA AGACATTTTT 


¥178 


GTAGAGTATT 


TGAGGTGPfT APPAPAPAGG 


AAAPPTTGAA GAAAATPAGT 




TTPTAGAAGf 


PGPTGTTAPP TPTTGTTTAP 

VA2V/ lull r\v_/\v i \f l l VJ l l l r\\j 


AGTTTATATA TATATGATAG 

f\\j i i minin inini Ufl l r\\J 


3778 ATATGAGATA TATATATAAA AGGTACTGTT AACTACTGTA CAACCCGACT




TGATAATGGT 

1 \jr\ 1 rVA 1 \JU 1 


GPTTTPAAAP AGCGAGATGA 


GTAAAAAPAT PAGPTTPPAP 


3878 GTTGCCTTCT GCGCAAAGGG TTTCACCAAG GATGGAGAAA GGGAGAGAGC

3928 TTGCAGATGG CGCGTTCTCA CGGTGGGCTC TTCCCCTTGG TTTGTAACGA

3978 AGTGAAGGAG GAGAACTTGG GAGCCAGGTT CTCCCTGCCA AAAAGGGGGC

4028 TAGATGAGGT GGTCGGGCCC GTGGACAGCT GAGAGTGGGA TTCATCCAGA

4078 CTCATGCAAT AACCCTTTGA TTGTTTTCTA AAAGGAGACT CCCTCGGCAA

4128 GATGGCAGAG GGTACGGAGT CTTCAGGCCC AGTTTCTCAC TTTAGCCAAT

4178 TCGAGGGCTC CTTGTGGTGG GATCAGAACT AATCCAGAGT GTGGGAAAGT

4228 GACAGTCAAA ACCCCACCTG GAGCAAATAA AAAAACATAC AAAACGTAAA

4278 AAAAAAAAAA AAAAAA
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Human DNMT3B1 DNA Sequence: 

1 GGCCGCGAAT TCGGCACGAG CCCTGCACGG CCGCCAGCCG GCCTCCCGCC 

51 AGCCAGCCCC GACCCGCGGC TCCGCCGCCC AGCCGCGCCC CAGCCAGCCC 

101 TGCGGCAGGA AAGCATGAAG GGAGACACCA GGCATCTCAA TGGAGAGGAG 

151 GACGCCGGCG GGAGGGAAGA CTCGATCCTC GTCAACGGGG CCTGCAGCGA 

201 CCAGTCCTCC GACTCGCCCC CAATCCTGGA GGCTATCCGC ACCCCGGAGA 

251 TCAGAGGCCG AAGATCAAGC TCGCGACTCT CCAAGAGGGA GGTGTCCAGT

301 CTGCTAAGCT ACACACAGGA CTTGACAGGC GATGGCGACG GGGAAGATGG 

351 GGATGGCTCT GACACCCCAG TCATGCCAAA GCTCTTCCGG GAAACCAGGA 

401 CTCGTTCAGA AAGCCCAGCT GTCCGAACTC GAAATAACAA CAGTGTCTCC 

451 AGCCGGGAGA GGCACAGGCC TTCCCCACGT TCCACCCGAG GCCGGCAGGG 

501 CCGCAACCAT GTGGACGAGT CCCCCGTGGA GTTCCCGGCT ACCAGGTCCC

551 TGAGACGGCG GGCAACAGCA TCGGCAGGAA CGCCATGGCC GTCCCCTCCC 

601 AGCTCTTACC TTACCATCGA CCTCACAGAC GACACAGAGG ACACACATGG 

651 GACGCCCCAG AGCAGCAGTA CCCCCTACGC CCGCCTAGCC CAGGACAGCC 

701 AGCAGGGGGG CATGGAGTCC CCGCAGGTGG AGGCAGACAG TGGAGATGGA 

751 GACAGTTCAG AGTATCAGGA TGGGAAGGAG TTTGGAATAG GGGACCTCGT 

801 GTGGGGAAAG ATCAAGGGCT TCTCCTGGTG GCCCGCCATG GTGGTGTCTT 

851 GGAAGGCCAC CTCCAAGCGA CAGGCTATGT CTGGCATGCG GTGGGTCCAG 

901 TGGTTTGGCG ATGGCAAGTT CTCCGAGGTC TCTGCAGACA AACTGGTGGC 

951 ACTGGGGCTG TTCAGCCAGC ACTTTAATTT GGCCACCTTC AATAAGCTCG 

1001 TCTCCTATCG AAAAGCCATG TACCATGCTC TGGAGAAAGC TAGGGTGCGA 

1051 GCTGGCAAGA CCTTCCCCAG CAGCCCTGGA GACTCATTGG AGGACCAGCT 

1101 GAAGCCCATG TTGGAGTGGG CCCACGGGGG CTTCAAGCCC ACTGGGATCG 

1151 AGGGCCTCAA ACCCAACAAC ACGCAACCAG TGGTTAATAA GTCGAAGGTG



FIG.1D-1






1201 CGTCGTGCAG GCAGTAGGAA ATTAGAATCA AGGAAATACG AGAACAAGAC

1251 TCGAAGACGC ACAGCTGACG ACTCAGCCAC CTCTGACTAC TGCCCCGCAC

1301 CCAAGCGCCT CAAGACAAAT TGCTATAACA ACGGCAAAGA CCGAGGGGAT

1351 GAAGATCAGA GCCGAGAACA AATGGCTTCA GATGTTGCCA ACAACAAGAG

1401 CAGCCTGGAA GATGGCTGTT TGTCTTGTGG CAGGAAAAAC CCCGTGTCCT

1451 TCCACCCTCT CTTTGAGGGG GGGCTCTGTC AGACATGCCG GGATCGCTTC

1501 CTTGAGCTGT TTTACATGTA TGATGACGAT GGCTATCAGT CTTACTGCAC

1551 TGTGTGCTGC GAGGGCCGAG AGCTGCTGCT TTGCAGCAAC ACGAGCTGCT

1601 GCCGGTGTTT CTGTGTGGAG TGCCTGGAGG TGCTGGTGGG CACAGGCACA

1651 GCGGCCGAGG CCAAGCTTCA GGAGCCCTGG AGCTGCTACA TGTGTCTCCC

1701 GCAGCGCTGT CATGGCGTCC TGCGGCGCCG GAAGGACTGG AACGTGCGCC

1751 TGCAGGCCTT CTTCACCAGT GACACGGGGC TTGAATACGA AGCCCCCAAG

1801 CTGTACCCTG CCATTCCCGC AGCCCGAAGG CGGCCCATTC GAGTCCTGTC

1851 ATTGTTTGAT GGCATCGCGA CAGGCTACCT AGTCCTCAAA GAGTTGGGCA

1901 TAAAGGTAGG AAAGTACGTC GCTTCTGAAG TGTGTGAGGA GTCCATTGCT

1951 GTTGGAACCG TGAAGCACGA GGGGAATATC AAATACGTGA ACGACGTGAG

2001 GAACATCACA AAGAAAAATA TTGAAGAATG GGGCCCATTT GACTTGGTGA

2051 TTGGCGGAAG CCCATGCAAC GATCTCTCAA ATGTGAATCC AGCCAGGAAA

2101 GGCCTGTATG AGGGTACAGG CCGGCTCTTC TTCGAATTTT ACCACCTGCT

2151 GAATTACTCA CGCCCCAAGG AGGGTGATGA CCGGCCGTTC TTCTGGATGT

2201 TTGAGAATGT TGTAGCCATG AAGGTTGGCG ACAAGAGGGA CATCTCACGG

2251 TTCCTGGAGT GTAATCCAGT GATGATTGAT GCCATCAAAG TTTCTGCTGC

2301 TCACAGGGCC CGATACTTCT GGGGCAACCT ACCCGGGATG AACAGGCCCG

2351 TGATAGCATC AAAGAATGAT AAACTCGAGC TGCAGGACTG CTTGGAATAC

2401 AATAGGATAG CCAAGTTAAA GAAAGTACAG ACAATAACCA CCAAGTCGAA



FIG.1D-2 
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2451 CTCGATCAAA CAGGGGAAAA ACCAACTTTT CCCTGTTGTC ATGAATGGCA 

2501 AAGAAGATGT TTTGTGGTGC ACTGAGCTCG AAAGGATCTT TGGCTTTCCT 

2551 GTGCACTACA CAGACGTGTC CAACATGGGC CGTGGTGCCC GCCAGAAGCT 

2601 GCTGGGAAGG TCCTGGAGCG TGCCTGTCAT CCGACACCTC TTCGCCCCTC 

2651 TGAAGGACTA CTTTGCATGT GAATAGTTCC AGCCAGGCCC CAAGCCCACT 

2701 GGGGTGTGTG GCAGAGCCAG GACCCAGGAG GTGTGATTCC TGAAGGCATC 

2751 CCCAGGCCCT GCTCTTCCTC AGCTGTGTGG GTCATACCGT GTACCTCAGT

2801 TCCCTCTTGC TCAGTGGGGG CAGAGCCACC TGACTCTTGC AGGGGTAGCC

2851 TGAGGTGCCG CCTCCTTGTG CACAAATCAG ACCTGGCTGC TTGGAGCAGC 

2901 CTAACACGGT GCTCATTTTT TCTTCTCCTA AAACTTTAAA ACTTGAAGTA 

2951 GGTAGCAACG TGGCTTTTTT TTTTTCCCTT CCTGGGTCTA CCACTCAGAG 

3001 AAACAATGGC TAAGATACCA AAACCACAGT GCCGACAGCT CTCCAATACT 

3051 CAGGTTAATG CTGAAAAATC ATCCAAGACA GTTATTGCAA GAGTTTAATT

3101 TTTGAAAACT GGGTACTGCT ATGTGTTTAC AGACGTGTGC AGTTGTAGGC 

3151 ATGTAGCTAC AGGACATTTT TAAGGGCCCA GGATCGTTTT TTCCCAGGGC 

3201 AAGCAGAAGA GAAAATGTTG TATATGTCTT TTACCCGGCA CATTCCCCTT 

3251 GCCTAAATAC AAGGGCTGGA GTCTGCACGG GACCTATTAG AGTATTTTCC 

3301 ACAATGATGA TGATTTCAGC AGGGATGACG TCATCATCAC ATTCAGGGCT 

3351 ATTTTTTCCC CCACAAACCC AAGGGCAGGG GCCACTCTTA GCTAAATCCC 

3401 TCCCCGTGAC TGCAATAGAA CCCTCTGGGG AGCTCAGGAA GGGGTGTGCT 

3451 GAGTTCTATA ATATAAGCTG CCATATATTT TGTAGACAAG TATGGCTCCT 

3501 CCATATCTCC CTCTTCCCTA GGAGAGGAGT GTGAAGCAAG GAGCTTAGAT 

3551 AAGACACCCC CTCAAACCCA TTCCCTCTCC AGGAGACCTA CCCTCCACAG 

3601 GCACAGGTCC CCAGATGAGA AGTCTGCTAC CCTCATTTCT CATCTTTTTA 

3651 CTAAACTCAG AGGCAGTGAC AGCAGTCAGG GACAGACATA CATTTCTCAT



FIG.1D-3
3701 ACCTTCCCCA CATCTGAGAG ATGACAGGGA AAACTGCAAA GCTCGGTGCT

3751 CCCTTTGGAG ATTTTTTAAT CCTTTTTTAT TCCATAAGAA GTCGTTTTTA 

3801 GGGAGAACGG GAATTCAGAC AAGCTGCATT TCAGAAATGC TGTCATAATG 

3851 GTTTTTAACA CCTTTTACTC TTCTTACTGG TGCTATTTTG TAGAATAAGG 

3901 AACAACGTTG ACAAGTTTTG TGGGGCTTTT TATACACTTT TTAAAATCTC 

3951 AAACTTCTAT TTTTATGTTT AACGTTTTCA TTAAAATTTT TTTGTAACTG 

4001 GAGCCACGAC GTAACAAATA TGGGGAAAAA ACTGTGCCTT GTTTCAACAG 

4051 TTTTTGCTAA TTTTTAGGCT GAAAGATGAC GGATGCCTAG AGTTTACCTT 

4101 ATGTTTAATT AAAATCAGTA TTTGTCTAAA AAAAAAAAAA AAAAA 
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Human DNMT3B1 Protein 

1 MKGDTRHLNG EEDAGGREDS ILVNGACSDQ SSDSPPILEA IRTPEIRGRR

51 SSSRLSKREV SSLLSYTQDL TGDGDGEDGD GSDTPVMPKL FRETRTRSES

101 PAVRTRNNNS VSSRERHRPS PRSTRGRQGR NHVDESPVEF PATRSLRRRA

151 TASAGTPWPS PPSSYLTIDL TDDTEDTHGT PQSSSTPYAR LAQDSQQGGM

201 ESPQVEADSG DGDSSEYQDG KEFGIGDLVW GKIKGFSWWP AMVVSWKATS

251 KRQAMSGMRW VQWFGDGKFS EVSADKLVAL GLFSQHFNLA TFNKLVSYRK

301 AMYHALEKAR VRAGKTFPSS PGDSLEDQLK PMLEWAHGGF KPTGIEGLKP

351 NNTQPVVNKS KVRRAGSRKL ESRKYENKTR RRTADDSATS DYCPAPKRLK

401 TNCYNNGKDR GDEDQSREQM ASDVANNKSS LEDGCLSCGR KNPVSFHPLF

451 EGGLCQTCRD RFLELFYMYD DDGYQSYCTV CCEGRELLLC SNTSCCRCFC

501 VECLEVLVGT GTAAEAKLQE PWSCYMCLPQ RCHGVLRRRK DWNVRLQAFF

551 TSDTGLEYEA PKLYPAIPAA RRRPIRVLSL FDGIATGYLV LKELGIKVGK

601 YVASEVCEES IAVGTVKHEG NIKYVNDVRN ITKKNIEEWG PFDLVIGGSP

651 CNDLSNVNPA RKGLYEGTGR LFFEFYHLLN YSRPKEGDDR PFFWMFENVV

701 AMKVGDKRDI SRFLECNPVM IDAIKVSAAH RARYFWGNLP GMNRPVIASK

751 NDKLELQDCL EYNRIAKLKK VQTITTKSNS IKQGKNQLFP VVMNGKEDVL

801 WCTELERIFG FPVHYTDVSN MSRGARQKLL GRSWSVPVIR HLFAPLKDYF

851 ACE* 
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